All pending claims are reproduced below. Marked-up copies of the amended claims are
provided in the Appendix to this Response. 
A file content classification system comprising:

a plurality of agents, each agent including a file content ID generator creating 
file content IDs using a mathematical algorithm, at least one agent provided on one of 
a plurality of clients; 

an ID appearance database, provided on a server, coupled to receive file 
content IDs from the agents; and 

a characteristic comparison routine on the server, identifying a characteristic 
of the file content based on the appearance of the file content ID in the appearance 
database and transmitting the characteristic to the client agents. 



The content classification system of claim [illegible] wherein said ID generator
comprises a hashing algorithm.

The content classification system of claim 3S wherein said hashing algorithm
is the MD5 hashing algorithm.

The content classification system of claim [illegible] wherein said ID appearance
database tracks the frequency of appearance of a digital ID.

The content classification system of claim [illegible] wherein said plurality of agents
are coupled to said database via a combination of public and private networks.

The content classification system of claim 35 wherein said database is
coupled to an intermediate server which is coupled to said plurality of agents.

The content classification system of claim [illegible] wherein said intermediate
server is a web server.
The content classification system of claim [illegible] wherein said characteristic
comprises junk e-mail and said characteristic is defined by a frequency of appearance
of a file content ID.

A method for identifying characteristics of data files, comprising: 

receiving, on a processing system, file content identifiers for data files from a 
plurality of file content identifier generator agents, each agent provided on a source 
system and creating file content IDs using a mathematical algorithm, via a network; 

determining, on the processing system, whether each received content 
identifier matches a characteristic of other identifiers; and 

outputting, to at least one of the source systems responsive to a request from 
said source system, an indication of the characteristic of the data file based on said 
step of determining. 




The method of claim [illegible] wherein said file content identifier generates an
identifier by hashing at least a portion of the data file.







The method of claim [illegible] wherein said hashing comprises using the MD5 hash.




2. The method of claim [illegible] wherein said step of generating comprises hashing
multiple portions of the data file.

The method of claim [illegible] wherein each said data file is an email message and
said step of determining comprises determining whether said email is SPAM.

The method of claim 26 wherein said step of determining identifies said e- 
mail as SPAM by tracking the rate per unit time a digital ID is generated. 






The method of claim [illegible] wherein said method further includes the step of
instructing said plurality of source systems to perform an action with the email based
on said determining step.
A method of filtering an email message, comprising:

receiving, on a second computer, a digital content identifier created using a 
mathematical algorithm unique to the message content from at least two of a plurality 
of first computers having digital content ID generator agents;

comparing, on the second computer, the digital content identifier to a 
characteristic database of digital content identifiers received from said plurality of 
first computers to determine whether the message has a characteristic; and 

responding to a query from at least one of said plurality of computers to 
identify the existence or absence of said characteristic of the message based on said
comparing.

The method of claim [illegible] wherein said second computer is coupled to said
plurality of first computers by a combination of public and private networks.

The method of claim [illegible] wherein said step of receiving includes receiving
identifiers from said plurality of first systems via an intervening Web server.

The method of claim [illegible] wherein said plurality of systems are coupled by the
Internet.

The method of claim [illegible] wherein said step of comparing comprises
determining the frequency of a particular ID occurring in a time period, classifying
said ID as having a characteristic, and comparing digital content identifiers to said
classified IDs.

5p. A file content classification system for a first computer and a second 
computer coupled by a network, comprising: 

a client agent file content identifier generator on the first computer, the file 
content identifier comprising a computed value of at least two non-contiguous 
sections of data in a file; and 
a server comparison agent and data-structure on the second computer
receiving identifiers from the client agent and providing replies to the client agent; 

wherein the client agent processes the file based on replies from the server 
comparison agent. 



A method for providing a service on the Internet, comprising: 
collecting data on a processing system from a plurality of systems having a 
client agent generating digital content identifiers created using a mathematical 
algorithm for each of a plurality of files on the Internet to a server having a database; 

characterizing the files on the server system based on said digital content 
identifiers received relative to other digital content identifiers collected in the 
database; and 

transmitting a substance identifier from the server to the client agent 
indicating the presence or absence of a characteristic in the file.

The method of claim [illegible] wherein said step of collecting comprises collecting a
digital identifier for a data file.

The method of claim [illegible] wherein said file content is an e-mail.

The method of claim [illegible] wherein said step of characterizing comprises:
tracking the frequency of the collection of a particular identifier; 
characterizing the data file based on said frequency; 
storing the characterization; and 

comparing collected identifiers to the known characterization. 